Results 1 - 20 of 38
1.
J Acoust Soc Am ; 155(3): 1694-1703, 2024 03 01.
Article in English | MEDLINE | ID: mdl-38426839

ABSTRACT

The cochlear implant (CI) is currently the essential technological device for helping deaf patients hear sound, and it greatly enhances their listening experience. Unfortunately, it performs poorly for music listening because of the insufficient number of electrodes and inaccurate identification of music features. Therefore, this study applied source separation technology with a self-adjustment function to enhance the music listening benefits for CI users. In the objective analysis, the source-to-distortion, source-to-interference, and source-to-artifact ratios were 4.88, 5.92, and 15.28 dB, respectively, significantly better than the Demucs baseline model. In the subjective analysis, the proposed method scored approximately 28.1 and 26.4 points (out of 100) higher than the traditional baseline method VIR6 (vocal-to-instrument ratio, 6 dB) in the multi-stimulus test with hidden reference and anchor (MUSHRA), respectively. The experimental results showed that the proposed method can help CI users identify music in a live concert, and the personal self-fitting signal separation method performed better than any of the default baselines (vocal-to-instrument ratio of 6 dB or 0 dB). These findings suggest that the proposed system is a potential method for enhancing the music listening benefits for CI users.
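
For readers who want to reproduce this kind of objective analysis, the sketch below shows how source-to-distortion, source-to-interference, and source-to-artifact ratios can be computed with the open-source mir_eval package for a vocal/accompaniment split. The file names and the separation model feeding the estimates are placeholders, not the paper's self-adjusting system.

```python
# Hedged sketch: computing SDR / SIR / SAR for a vocal/accompaniment split
# with mir_eval. File names and the separation step are placeholders; the
# paper's own self-adjusting separation model is not reproduced here.
import numpy as np
import librosa
import mir_eval

sr = 44100
# Reference stems (ground truth) and estimates from any separation model.
ref_vocals, _ = librosa.load("ref_vocals.wav", sr=sr, mono=True)
ref_accomp, _ = librosa.load("ref_accompaniment.wav", sr=sr, mono=True)
est_vocals, _ = librosa.load("est_vocals.wav", sr=sr, mono=True)
est_accomp, _ = librosa.load("est_accompaniment.wav", sr=sr, mono=True)

# Trim all signals to a common length before stacking.
n = min(map(len, [ref_vocals, ref_accomp, est_vocals, est_accomp]))
reference = np.stack([ref_vocals[:n], ref_accomp[:n]])   # (n_sources, n_samples)
estimate = np.stack([est_vocals[:n], est_accomp[:n]])

sdr, sir, sar, perm = mir_eval.separation.bss_eval_sources(reference, estimate)
print(f"SDR={sdr.mean():.2f} dB, SIR={sir.mean():.2f} dB, SAR={sar.mean():.2f} dB")
```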


Subject(s)
Cochlear Implantation, Cochlear Implants, Deafness, Deep Learning, Music, Humans, Deafness/rehabilitation, Auditory Perception
2.
Article in English | MEDLINE | ID: mdl-37938964

ABSTRACT

Dysarthria, a speech disorder often caused by neurological damage, compromises patients' control of the vocal muscles, making their speech unclear and communication difficult. Recently, voice-driven methods have been proposed to improve the speech intelligibility of patients with dysarthria. However, most methods require a large recorded corpus from both the patient and the target speaker, which is problematic. This study aims to propose a data augmentation-based voice conversion (VC) system to reduce the recording burden on the speaker. We propose dysarthria voice conversion 3.1 (DVC 3.1), a data augmentation approach combining text-to-speech and the StarGAN-VC architecture to synthesize a large target and patient-like corpus and thereby lower the recording burden. An objective evaluation metric based on the Google automatic speech recognition (Google ASR) system and a listening test were used to demonstrate the speech intelligibility benefits of DVC 3.1 under free-talk conditions. The DVC system without data augmentation (DVC 3.0) was used for comparison. Subjective and objective evaluations indicated that the proposed DVC 3.1 system improved Google ASR accuracy for two patients with dysarthria by approximately [62.4%, 43.3%] and [55.9%, 57.3%] compared to unprocessed dysarthric speech and the DVC 3.0 system, respectively. Furthermore, DVC 3.1 increased the speech intelligibility of the two patients by approximately [54.2%, 22.3%] and [63.4%, 70.1%] compared to unprocessed dysarthric speech and the DVC 3.0 system, respectively. The proposed DVC 3.1 system offers significant potential to improve the speech intelligibility of patients with dysarthria and enhance their verbal communication quality.
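
The ASR-based intelligibility gains reported above can be quantified, in principle, by comparing word error rates before and after conversion. The sketch below uses the jiwer package for this comparison; the transcripts are hypothetical, any ASR engine could supply the hypotheses, and this is not the authors' evaluation pipeline.

```python
# Hedged sketch: quantifying ASR-based intelligibility gains by comparing
# word error rate (WER) before and after voice conversion. The transcripts
# below are hypothetical; any ASR engine could supply the hypotheses.
from jiwer import wer

reference = "please open the window and turn on the light"
hyp_unprocessed = "peas pen the wind turn the light"         # ASR output on raw dysarthric speech
hyp_converted = "please open the window and turn on light"   # ASR output on converted speech

acc_unprocessed = 1.0 - wer(reference, hyp_unprocessed)
acc_converted = 1.0 - wer(reference, hyp_converted)
print(f"ASR accuracy: {acc_unprocessed:.1%} -> {acc_converted:.1%} "
      f"(absolute gain {acc_converted - acc_unprocessed:.1%})")
```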


Subject(s)
Dysarthria, Voice, Humans, Dysarthria/etiology, Speech Intelligibility/physiology, Laryngeal Muscles
3.
IEEE Trans Biomed Eng ; 70(12): 3330-3341, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37327105

ABSTRACT

OBJECTIVE: Although many speech enhancement (SE) algorithms have been proposed to promote speech perception in hearing-impaired patients, the conventional SE approaches that perform well under quiet and/or stationary noises fail under nonstationary noises and/or when the speaker is at a considerable distance. Therefore, the objective of this study is to overcome the limitations of the conventional speech enhancement approaches. METHOD: This study proposes a speaker-closed deep learning-based SE method together with an optical microphone to acquire and enhance the speech of a target speaker. RESULTS: The objective evaluation scores achieved by the proposed method outperformed the baseline methods by a margin of 0.21-0.27 and 0.34-0.64 in speech quality (HASQI) and speech comprehension/intelligibility (HASPI), respectively, for seven typical hearing loss types. CONCLUSION: The results suggest that the proposed method can enhance speech perception by cutting off noise from speech signals and mitigating interference caused by distance. SIGNIFICANCE: The results of this study show a potential way that can help improve the listening experience in enhancing speech quality and speech comprehension/intelligibility for hearing-impaired people.
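
As a rough illustration of the noise-removal step that such SE front ends perform, the sketch below applies a generic Wiener-like time-frequency mask in the STFT domain. It is not the paper's speaker-closed deep learning model; the file name and the crude noise estimate are assumptions.

```python
# Hedged sketch: generic time-frequency masking speech enhancement.
# This is NOT the paper's speaker-closed deep learning model; it only
# illustrates the masking step that such systems typically apply.
import numpy as np
import librosa
import soundfile as sf

y, sr = librosa.load("noisy.wav", sr=16000)      # placeholder file name
S = librosa.stft(y, n_fft=512, hop_length=128)
mag, phase = np.abs(S), np.angle(S)

# Crude noise estimate from the first 0.25 s (assumed speech-free).
noise_frames = int(0.25 * sr / 128)
noise_psd = np.mean(mag[:, :noise_frames] ** 2, axis=1, keepdims=True)

# Wiener-like ratio mask; a trained network would predict this mask instead.
snr = np.maximum(mag ** 2 - noise_psd, 1e-8) / noise_psd
mask = snr / (snr + 1.0)

enhanced = librosa.istft(mask * mag * np.exp(1j * phase), hop_length=128, length=len(y))
sf.write("enhanced.wav", enhanced, sr)
```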


Asunto(s)
Implantes Cocleares , Aprendizaje Profundo , Audífonos , Pérdida Auditiva , Percepción del Habla , Humanos , Inteligibilidad del Habla
4.
Sensors (Basel) ; 23(5)2023 Feb 22.
Article in English | MEDLINE | ID: mdl-36904641

ABSTRACT

Mechanisms underlying exercise-induced muscle fatigue and recovery are dependent on peripheral changes at the muscle level and improper control of motoneurons by the central nervous system. In this study, we analyzed the effects of muscle fatigue and recovery on the neuromuscular network through the spectral analysis of electroencephalography (EEG) and electromyography (EMG) signals. A total of 20 healthy right-handed volunteers performed an intermittent handgrip fatigue task. In the prefatigue, postfatigue, and postrecovery states, the participants contracted a handgrip dynamometer with sustained 30% maximal voluntary contractions (MVCs); EEG and EMG data were recorded. A considerable decrease was noted in EMG median frequency in the postfatigue state compared with the findings in other states. Furthermore, the EEG power spectral density of the right primary cortex exhibited a prominent increase in the gamma band. Muscle fatigue led to increases in the beta and gamma bands of contralateral and ipsilateral corticomuscular coherence, respectively. Moreover, a decrease was noted in corticocortical coherence between the bilateral primary motor cortices after muscle fatigue. EMG median frequency may serve as an indicator of muscle fatigue and recovery. Coherence analysis revealed that fatigue reduced the functional synchronization among bilateral motor areas but increased that between the cortex and muscle.
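
A minimal sketch of the two core signal measures in this study, EMG median frequency from a Welch power spectrum and EEG-EMG (corticomuscular) coherence, is given below using SciPy. The synthetic signals and sampling rate are placeholders for the recorded epochs.

```python
# Hedged sketch: EMG median frequency and EEG-EMG coherence with SciPy.
# Synthetic signals stand in for the recorded EEG/EMG epochs.
import numpy as np
from scipy.signal import welch, coherence

fs = 1000                      # sampling rate (Hz), assumed
t = np.arange(0, 10, 1 / fs)
emg = np.random.randn(t.size)              # placeholder EMG epoch
eeg = 0.3 * emg + np.random.randn(t.size)  # placeholder EEG epoch

# EMG median frequency: frequency that splits the PSD area in half.
f, pxx = welch(emg, fs=fs, nperseg=1024)
cum = np.cumsum(pxx)
mdf = f[np.searchsorted(cum, cum[-1] / 2)]
print(f"EMG median frequency: {mdf:.1f} Hz")

# Corticomuscular coherence, averaged over the beta (13-30 Hz) band.
f_c, cxy = coherence(eeg, emg, fs=fs, nperseg=1024)
beta = (f_c >= 13) & (f_c <= 30)
print(f"Mean beta-band EEG-EMG coherence: {cxy[beta].mean():.3f}")
```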


Subject(s)
Motor Cortex, Muscle Fatigue, Humans, Muscle Fatigue/physiology, Electromyography, Muscle, Skeletal/physiology, Hand Strength/physiology, Electroencephalography, Motor Cortex/physiology
5.
J Voice ; 2023 Jan 31.
Article in English | MEDLINE | ID: mdl-36732109

ABSTRACT

OBJECTIVE: Physicians currently rely primarily on auditory-perceptual evaluation, such as the grade, roughness, breathiness, asthenia, and strain (GRBAS) scale, to evaluate voice quality and determine treatment. However, ratings by individual physicians often differ because of subjective perception and the time interval between assessments, especially when the patient's symptoms are hard to judge. Therefore, an accurate computerized pathological voice quality assessment system would improve the quality of assessment. METHOD: This study proposes a self-attention-based deep learning system, named self-attention-based bidirectional long short-term memory (SA BiLSTM). Different pitches (low, normal, high) and vowels (/a/, /i/, /u/) were fed into the proposed model so that it could learn, in a high-dimensional view, how professional physicians rate the GRBAS scale. RESULTS: The experimental results showed that the proposed system outperformed the baseline systems. More specifically, the macro-averaged F1 score (expressed as a decimal) was used to compare classification accuracy. The (G, R, B) scores of the proposed system were (0.768±0.011, 0.820±0.009, 0.815±0.009), higher than those of the baseline deep neural network (0.395±0.010, 0.312±0.019, 0.321±0.014) and convolutional neural network (0.421±0.052, 0.306±0.043, 0.325±0.032), respectively. CONCLUSIONS: The proposed system, combining SA BiLSTM with pitch and vowel information, provides a more accurate way to evaluate the voice. This will be helpful for clinical voice evaluation and will improve the benefit patients receive from voice therapy.
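
One plausible way to realize the SA BiLSTM idea is a bidirectional LSTM whose time steps are pooled by additive self-attention before a classification head, as sketched below in PyTorch. The feature dimension, sequence length, and 4-point grade scale are assumptions rather than the paper's exact configuration; the macro-averaged F1 reported above could then be computed with scikit-learn's f1_score(average="macro").

```python
# Hedged sketch: a bidirectional LSTM with additive self-attention pooling
# for GRBAS-style grading. Feature dimensions and the 4-point grade scale
# are assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn

class SABiLSTM(nn.Module):
    def __init__(self, n_features=40, hidden=128, n_classes=4):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)     # scores each time step
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                        # x: (batch, time, n_features)
        h, _ = self.lstm(x)                      # (batch, time, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)   # attention weights over time
        pooled = (w * h).sum(dim=1)              # weighted sum -> (batch, 2*hidden)
        return self.head(pooled)

model = SABiLSTM()
dummy = torch.randn(8, 200, 40)                  # 8 utterances, 200 frames, 40-dim features
print(model(dummy).shape)                        # torch.Size([8, 4])
```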

6.
Asia Pac J Ophthalmol (Phila) ; 12(1): 21-28, 2023.
Article in English | MEDLINE | ID: mdl-36706331

ABSTRACT

PURPOSE: The aim was to develop a deep learning model for predicting the extent of visual impairment in epiretinal membrane (ERM) using optical coherence tomography (OCT) images, and to analyze the associated features. METHODS: Six hundred macular OCT images from eyes with ERM and no visually significant media opacity or other retinal diseases were obtained. Those with best-corrected visual acuity ≤20/50 were classified as "profound visual impairment," while those with best-corrected visual acuity >20/50 were classified as "less visual impairment." Ninety percent of images were used as the training data set and 10% were used for testing. Two convolutional neural network models (ResNet-50 and ResNet-18) were adopted for training. The t-distributed stochastic neighbor-embedding approach was used to compare their performances. The Grad-CAM technique was used in the heat map generative phase for feature analysis. RESULTS: During the model development, the training accuracy was 100% in both convolutional neural network models, while the testing accuracy was 70% and 80% for ResNet-18 and ResNet-50, respectively. The t-distributed stochastic neighbor-embedding approach found that the deeper structure (ResNet-50) had better discrimination on OCT characteristics for visual impairment than the shallower structure (ResNet-18). The heat maps indicated that the key features for visual impairment were located mostly in the inner retinal layers of the fovea and parafoveal regions. CONCLUSIONS: Deep learning algorithms could assess the extent of visual impairment from OCT images in patients with ERM. Changes in inner retinal layers were found to have a greater impact on visual acuity than the outer retinal changes.
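
A minimal sketch of the transfer-learning setup described above, fine-tuning an ImageNet-pretrained ResNet-50 for the binary visual-impairment label, is shown below. The input size, optimizer, and training details are illustrative assumptions, and the dummy batch stands in for real OCT images.

```python
# Hedged sketch: fine-tuning an ImageNet-pretrained ResNet-50 for a binary
# OCT classification task (torchvision >= 0.13). Transforms, class count,
# and training details are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)    # profound vs. less visual impairment

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One illustrative training step on a dummy batch of 224x224 RGB OCT crops.
images = torch.randn(4, 3, 224, 224)
labels = torch.tensor([0, 1, 0, 1])
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(f"dummy batch loss: {loss.item():.4f}")
```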


Subject(s)
Deep Learning, Epiretinal Membrane, Humans, Epiretinal Membrane/diagnostic imaging, Tomography, Optical Coherence/methods, Retina/diagnostic imaging, Vision Disorders/etiology, Retrospective Studies
7.
J Chin Med Assoc ; 86(1): 105-112, 2023 01 01.
Article in English | MEDLINE | ID: mdl-36300992

ABSTRACT

BACKGROUND: The population of young adults with hearing impairment increases yearly, and a device that enables convenient hearing screening could help monitor their hearing. However, background noise is a critical issue that limits the capabilities of such a device. Therefore, this study evaluated the effectiveness of commercial active noise cancellation (ANC) headphones for hearing screening applications in the presence of background noise. In particular, six confounders were used for a comprehensive evaluation. METHODS: We enrolled 12 young adults (a total of 23 ears with normal hearing) to participate in this study. A cross-sectional self-controlled study was conducted to explore the effectiveness of hearing screening in the presence of background noise, with a total of 240 test conditions (3 ANC models × 2 ANC function statuses × 2 noise types × 5 noise levels × 4 frequencies) for each test ear. Subsequently, a linear regression model was used to assess the effectiveness of ANC headphones for hearing screening applications in the presence of background noise, with six confounders. RESULTS: The experimental results showed that, on average, the ANC function of headphones can improve the effectiveness of hearing screening tasks in the presence of background noise. Specifically, the statistical analysis showed that the ANC function enabled a significant 10% improvement (p < 0.001) compared with no ANC function. CONCLUSION: This study confirmed the effectiveness of ANC headphones for young adult hearing screening applications in the presence of background noise. Furthermore, the statistical results confirmed that, as confounding variables, noise type, noise level, hearing screening frequency, ANC headphone model, and sex all affect the effectiveness of the ANC function. These findings suggest that ANC is a potential means of helping users obtain high-accuracy hearing screening results in the presence of background noise. Moreover, we present possible directions of development for ANC headphones in future studies.
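
The confounder-adjusted analysis described above can be sketched as an ordinary least-squares regression of a screening outcome on ANC status plus the listed covariates, for example with statsmodels. The column names and the outcome definition are hypothetical.

```python
# Hedged sketch: linear regression of a hearing-screening outcome on ANC
# status, adjusting for the confounders listed in the abstract. Column names
# and the outcome definition ("correct" screening response) are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("screening_trials.csv")    # placeholder: one row per test condition
model = smf.ols(
    "correct ~ C(anc_on) + C(headphone_model) + C(noise_type)"
    " + noise_level_db + C(frequency_hz) + C(sex)",
    data=df,
).fit()
print(model.summary())                      # coefficient on C(anc_on) estimates the ANC effect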


Asunto(s)
Pérdida Auditiva , Ruido , Adulto Joven , Humanos , Proyectos Piloto , Estudios Transversales , Ruido/prevención & control , Audición
8.
Sensors (Basel) ; 22(19)2022 Sep 27.
Article in English | MEDLINE | ID: mdl-36236430

ABSTRACT

With the development of active noise cancellation (ANC) technology, ANC has been used to mitigate the effects of environmental noise on audiometric results. However, objective evaluation methods supporting the accuracy of audiometry for ANC exposure to different levels of noise have not been reported. Accordingly, the audio characteristics of three different ANC headphone models were quantified under different noise conditions and the feasibility of ANC in noisy environments was investigated. Steady (pink noise) and non-steady noise (cafeteria babble noise) were used to simulate noisy environments. We compared the integrity of pure-tone signals obtained from three different ANC headphone models after processing under different noise scenarios and analyzed the degree of ANC signal correlation based on the Pearson correlation coefficient compared to pure-tone signals in quiet. The objective signal correlation results were compared with audiometric screening results to confirm the correspondence. Results revealed that ANC helped mitigate the effects of environmental noise on the measured signal and the combined ANC headset model retained the highest signal integrity. The degree of signal correlation was used as a confidence indicator for the accuracy of hearing screening in noise results. It was found that the ANC technique can be further improved for more complex noisy environments.
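
The correlation step reduces to comparing the tone recorded through the ANC headset in noise against the same tone recorded in quiet, as sketched below with SciPy's Pearson correlation. File names are placeholders, and the sketch assumes the two recordings are already time-aligned.

```python
# Hedged sketch: Pearson correlation between a pure-tone signal recorded
# through ANC headphones in noise and the reference recording in quiet.
# File names are placeholders.
import librosa
from scipy.stats import pearsonr

quiet, sr = librosa.load("tone_1kHz_quiet.wav", sr=None, mono=True)
noisy, _ = librosa.load("tone_1kHz_anc_babble.wav", sr=sr, mono=True)

n = min(len(quiet), len(noisy))             # assumes the recordings are time-aligned
r, p = pearsonr(quiet[:n], noisy[:n])
print(f"signal correlation r = {r:.3f} (p = {p:.3g})")
```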


Asunto(s)
Tamizaje Masivo , Ruido , Audiometría de Tonos Puros/métodos , Estudios de Factibilidad , Audición
9.
Article in English | MEDLINE | ID: mdl-36085875

ABSTRACT

Patients with dysarthria generally produce distorted speech with limited intelligibility for both human listeners and machines. To enhance the intelligibility of dysarthric speech, we applied a deep learning-based speech enhancement (SE) system to this task. Conventional SE approaches are used to remove noise components from a noise-corrupted input and thus improve sound quality and intelligibility simultaneously. In this study, we focus on reconstructing the severely distorted signal from dysarthric speech to improve intelligibility. The proposed SE system trains a convolutional neural network (CNN) model in the training phase, which is then used to process dysarthric speech in the testing phase. During training, paired dysarthric-normal speech utterances are required. We adopt a dynamic time warping technique to align the dysarthric-normal utterances. The resulting training data are used to train a CNN-based SE model. The proposed SE system is evaluated with the Google automatic speech recognition (ASR) system and a subjective listening test. The results showed that the proposed method notably improved recognition performance, by more than 10% for both ASR and human listeners, compared with unprocessed dysarthric speech. Clinical Relevance - This study enhances the intelligibility and ASR accuracy of dysarthric speech by more than 10%.
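
A minimal sketch of the alignment step, warping a dysarthric utterance onto its normal-speaker counterpart with DTW over MFCC features, is shown below with librosa. File names are placeholders, and the aligned frame pairs would then feed the CNN-based mapping model.

```python
# Hedged sketch: aligning paired dysarthric/normal utterances with DTW on
# MFCC features, as a preprocessing step before training a mapping model.
# File names are placeholders.
import numpy as np
import librosa

dys, sr = librosa.load("dysarthric_001.wav", sr=16000)
ref, _ = librosa.load("normal_001.wav", sr=16000)

mfcc_dys = librosa.feature.mfcc(y=dys, sr=sr, n_mfcc=13)   # (13, T1)
mfcc_ref = librosa.feature.mfcc(y=ref, sr=sr, n_mfcc=13)   # (13, T2)

# DTW returns the accumulated cost matrix and the optimal warping path.
D, wp = librosa.sequence.dtw(X=mfcc_dys, Y=mfcc_ref, metric="euclidean")
wp = np.flipud(wp)                                          # path from start to end

# Frame-level pairs (i, j) give aligned dysarthric/normal training frames.
aligned_dys = mfcc_dys[:, wp[:, 0]]
aligned_ref = mfcc_ref[:, wp[:, 1]]
print(aligned_dys.shape, aligned_ref.shape)
```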


Asunto(s)
Disartria , Habla , Percepción Auditiva , Disartria/diagnóstico , Humanos , Redes Neurales de la Computación , Sonido
10.
Annu Int Conf IEEE Eng Med Biol Soc ; 2022: 1972-1976, 2022 07.
Article in English | MEDLINE | ID: mdl-36086160

ABSTRACT

Envelope waveforms can be extracted from multiple frequency bands of a speech signal, and these envelopes carry important intelligibility information for human speech communication. This study aimed to investigate whether a deep learning-based model using temporal envelope features could synthesize intelligible speech, and to study the effect of reducing the number of temporal envelope bands (from 8 to 2 in this work) on the intelligibility of the synthesized speech. The objective evaluation metric of short-time objective intelligibility (STOI) showed that, on average, the synthesized speech of the proposed approach achieved high STOI scores (approximately 0.8) in each test condition, and the human listening test showed that the average word correct rate of eight listeners was higher than 97.5%. These findings indicate that the proposed deep learning-based system is a potential approach for synthesizing highly intelligible speech from limited envelope information in the future.
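
The envelope-extraction idea can be sketched as band-pass filtering followed by Hilbert envelopes, with STOI available from the pystoi package for objective scoring. The band edges and file name below are arbitrary assumptions, and the clean signal stands in for the synthesized output.

```python
# Hedged sketch: extracting temporal envelopes from a few analysis bands
# (band edges are arbitrary here) and scoring a synthesized signal with STOI.
import numpy as np
import librosa
from scipy.signal import butter, sosfiltfilt, hilbert
from pystoi import stoi

clean, sr = librosa.load("clean.wav", sr=10000)          # placeholder file name

band_edges = [(100, 500), (500, 1500), (1500, 3000), (3000, 4500)]  # assumed 4 bands
envelopes = []
for lo, hi in band_edges:
    sos = butter(4, [lo, hi], btype="bandpass", fs=sr, output="sos")
    band = sosfiltfilt(sos, clean)
    envelopes.append(np.abs(hilbert(band)))              # temporal envelope of the band
envelopes = np.stack(envelopes)                          # (n_bands, n_samples)

# A vocoder or neural synthesizer would reconstruct speech from `envelopes`;
# here the clean signal itself stands in for the synthesized output.
synthesized = clean
print("STOI:", stoi(clean, synthesized, sr, extended=False))
```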


Asunto(s)
Aprendizaje Profundo , Percepción del Habla , Percepción Auditiva , Humanos , Inteligibilidad del Habla , Factores de Tiempo
11.
JASA Express Lett ; 2(5): 055202, 2022 05.
Article in English | MEDLINE | ID: mdl-36154065

ABSTRACT

Medical masks have become necessary of late because of the COVID-19 outbreak; however, they tend to attenuate the energy of speech signals and affect speech quality. Therefore, this study proposes an optical-based microphone approach to obtain speech signals from speakers' medical masks. Experimental results showed that the optical-based microphone approach achieved better performance (85.61%) than the two baseline approaches, namely, omnidirectional (24.17%) and directional microphones (31.65%), in the case of long-distance speech and background noise. The results suggest that the optical-based microphone method is a promising approach for acquiring speech from a medical mask.


Subject(s)
COVID-19, Hearing Aids, Speech Perception, COVID-19/prevention & control, Equipment Design, Humans, Masks, Speech, Vibration
12.
EClinicalMedicine ; 46: 101378, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35434580

ABSTRACT

Background: Hearing loss is a common morbidity that requires a hearing device to improve quality of life and prevent sequelae such as dementia, depression, falls, and cardiovascular disease. However, conventional hearing aids have some limitations, including poor accessibility and unaffordability. Consequently, personal sound amplification products (PSAPs) are considered a potential first-line alternative for patients with hearing loss. The main objective of this study was to compare the efficacy of PSAPs and conventional hearing aids regarding hearing benefits in patients with hearing loss. Methods: This systematic review and meta-analysis followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Five databases and reference lists were searched from inception to January 12, 2022. Studies including randomised controlled trials, nonrandomised controlled trials, or observational studies comparing PSAPs and hearing aids with regard to hearing gain performance (e.g., speech intelligence) were considered eligible. The review was registered prospectively on PROSPERO (CRD42021267187). Findings: Of 599 records identified in the preliminary search, five studies were included in the review and meta-analysis. A total of 124 patients were divided into the PSAP group and the conventional hearing aid group. Five studies including seven groups compared differences in speech intelligence in the signal-noise ratio (SNR) on the hearing in noise test (HINT) between PSAPs and conventional hearing aids. The pooled results showed nonsignificant differences in speech intelligence (SMD, 0.14; 95% CI, -0.19 to 0.47; P = .41; I² = 65%), sound quality (SMD, -0.37; 95% CI, -0.87 to 0.13; P = .15; I² = 77%), and listening effort (SMD, 0.02; 95% CI, -0.24 to 0.29; P = .86; I² = 32%). Nonsignificant results were also observed in subsequent analyses after excluding patients with moderately severe hearing loss. Complete sensitivity analyses with all of the possible combinations suggested nonsignificant results in most of the comparisons between PSAPs and conventional hearing aids. Interpretation: PSAPs may be as beneficial as conventional hearing aids for patients with hearing loss. The different features among PSAPs should be considered for patients indicated for hearing devices. Funding: This work was supported by grants from the Ministry of Science and Technology (MOST-10-2622-8-075-001) and the Veterans General Hospitals and University System of Taiwan Joint Research Program (VGHUST111-G6-11-2 and VGHUST111c-140).
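
The pooled SMD and I² figures above come from a random-effects meta-analysis; a minimal DerSimonian-Laird pooling sketch is given below with made-up per-study values that do not reproduce the review's data.

```python
# Hedged sketch: DerSimonian-Laird random-effects pooling of standardized
# mean differences (SMDs). The per-study SMDs and variances are made up and
# do not reproduce the review's data.
import numpy as np
from scipy.stats import norm

smd = np.array([0.30, -0.10, 0.25, 0.05, 0.20])    # hypothetical per-study SMDs
var = np.array([0.04, 0.06, 0.05, 0.03, 0.07])     # hypothetical per-study variances

w_fixed = 1.0 / var
q = np.sum(w_fixed * (smd - np.sum(w_fixed * smd) / np.sum(w_fixed)) ** 2)
df = len(smd) - 1
c = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
tau2 = max(0.0, (q - df) / c)                       # between-study variance
i2 = max(0.0, (q - df) / q) * 100                   # heterogeneity I^2 (%)

w = 1.0 / (var + tau2)
pooled = np.sum(w * smd) / np.sum(w)
se = np.sqrt(1.0 / np.sum(w))
lo, hi = pooled - 1.96 * se, pooled + 1.96 * se
p = 2 * (1 - norm.cdf(abs(pooled / se)))
print(f"SMD = {pooled:.2f} (95% CI {lo:.2f} to {hi:.2f}), p = {p:.2f}, I^2 = {i2:.0f}%")
```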

13.
Diagnostics (Basel) ; 12(4)2022 Apr 13.
Article in English | MEDLINE | ID: mdl-35454020

ABSTRACT

Traditional otoscopy has some limitations, including poor visualization and inadequate time for evaluation in suboptimal environments. Smartphone-enabled otoscopy may improve examination quality and serve as a potential diagnostic tool for middle ear diseases using a telemedicine approach. The main objectives are to compare the correctness of smartphone-enabled otoscopy and traditional otoscopy and to evaluate the diagnostic confidence of the examiner via meta-analysis. From inception through 20 January 2022, the Cochrane Library, PubMed, EMBASE, Web of Science, and Scopus databases were searched. Studies comparing smartphone-enabled otoscopy with traditional otoscopy regarding the outcome of interest were eligible. The relative risk (RR) for the rate of correctness in diagnosing ear conditions and the standardized mean difference (SMD) in diagnostic confidence were extracted. Sensitivity analysis and trial sequential analyses (TSAs) were conducted to further examine the pooled results. Study quality was evaluated by using the revised Cochrane risk of bias tool 2. Consequently, a total of 1840 examinees were divided into the smartphone-enabled otoscopy group and the traditional otoscopy group. Overall, the pooled result showed that smartphone-enabled otoscopy was associated with higher correctness than traditional otoscopy (RR, 1.26; 95% CI, 1.06 to 1.51; p = 0.01; I2 = 70.0%). Consistently significant associations were also observed in the analysis after excluding the simulation study (RR, 1.10; 95% CI, 1.00 to 1.21; p = 0.04; I2 = 0%) and normal ear conditions (RR, 1.18; 95% CI, 1.01 to 1.40; p = 0.04; I2 = 65.0%). For the confidence of examiners using both otoscopy methods, the pooled result was nonsignificant between the smartphone-enabled otoscopy and traditional otoscopy groups (SMD, 0.08; 95% CI, -0.24 to 0.40; p = 0.61; I2 = 16.3%). In conclusion, smartphone-enabled otoscopy was associated with a higher rate of correctness in the detection of middle ear diseases, and in patients with otologic complaints, the use of smartphone-enabled otoscopy may be considered. More large-scale studies should be performed to consolidate the results.

14.
Comput Methods Programs Biomed ; 215: 106602, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35021138

ABSTRACT

BACKGROUND AND OBJECTIVE: Most dysarthric patients encounter communication problems due to unintelligible speech. Many voice-driven systems have been proposed to improve their speech intelligibility; however, the intelligibility performance of these systems is affected by challenging application conditions (e.g., time variance of the patient's speech and background noise). To alleviate these problems, we proposed a dysarthria voice conversion (DVC) system for dysarthric patients and investigated its benefits under challenging application conditions. METHOD: A deep learning-based voice conversion system with phonetic posteriorgram (PPG) features, called the DVC-PPG system, was proposed in this study. An objective evaluation metric based on the Google automatic speech recognition (Google ASR) system and a listening test were used to demonstrate the speech intelligibility benefits of DVC-PPG under quiet and noisy test conditions; in addition, a well-known voice conversion system using mel-spectrogram features, DVC-Mels, was used for comparison to verify the benefits of the proposed DVC-PPG system. RESULTS: Under quiet conditions, the Google ASR metric showed that, averaged over two subjects in the duplicate and outside test conditions, the DVC-PPG system provided higher speech recognition rates (83.2% and 67.5%) than unprocessed dysarthric speech (36.5% and 26.9%) and DVC-Mels (52.9% and 33.8%). Moreover, the DVC-PPG system provided more stable performance than DVC-Mels under noisy test conditions. In addition, the listening test showed that the speech intelligibility of DVC-PPG was better than that of the dysarthric speech and DVC-Mels under the duplicate and outside conditions, respectively. CONCLUSIONS: The objective evaluation metric and listening test results showed that the recognition rate of the proposed DVC-PPG system was significantly higher than those obtained with the original dysarthric speech and the DVC-Mels system. Therefore, it can be inferred from our study that the DVC-PPG system can improve the ability of dysarthric patients to communicate with people under challenging application conditions.


Subject(s)
Speech Intelligibility, Voice, Dysarthria, Humans, Phonetics, Speech Production Measurement
15.
iScience ; 25(12): 105436, 2022 Dec 22.
Article in English | MEDLINE | ID: mdl-36590464

ABSTRACT

Given the low prevalence of hearing aid use among individuals with hearing loss, owing to high costs and social stigma, personal sound amplification products (PSAPs) may serve as alternatives that offer adequate hearing compensation and greater accessibility. This study examined the electroacoustic features of hearing aids and selected smartphone-bundled earphones, specifically AirPods, as PSAPs, and compared hearing performance among adults with mild-to-moderate hearing loss when aided with each hearing assistive device. Our results indicated that AirPods Pro met four out of five PSAP standards. Speech perception did not differ significantly between AirPods Pro and hearing aids in quiet, but differences appeared in the presence of background noise. AirPods Pro may have the potential to serve as a hearing assistive device for adults with mild-to-moderate hearing loss. More research is needed to investigate the safety and feasibility of using earphones bundled with other smartphones as PSAPs.

16.
Annu Int Conf IEEE Eng Med Biol Soc ; 2021: 131-134, 2021 11.
Article in English | MEDLINE | ID: mdl-34891255

ABSTRACT

The effective classification for imagined speech and intended speech is of great help to the development of speech-based brain-computer interfaces (BCIs). This work distinguished imagined speech and intended speech by employing the cortical EEG signals recorded from scalp. EEG signals from eleven subjects were recorded when they produced Mandarin-Chinese monosyllables in imagined speech and intended speech, and EEG features were classified by the common spatial pattern, time-domain, frequency-domain and Riemannian manifold based methods. The classification results indicated that the Riemannian manifold based method yielded the highest classification accuracy of 85.9% among the four classification methods. Moreover, the classification accuracy with the left-only brain electrode configuration was close to that with the whole brain electrode configuration. The findings of this work have potential to extend the output commands of silent speech interfaces.
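
The Riemannian-manifold route can be sketched with the pyriemann package: estimate a spatial covariance matrix per trial and classify by minimum distance to the Riemannian mean, cross-validated with scikit-learn. The random trial array below is a placeholder for the recorded EEG.

```python
# Hedged sketch: Riemannian-manifold classification of EEG trials with
# pyriemann (spatial covariance matrices + minimum distance to mean).
# Random data stands in for the recorded imagined/intended-speech trials.
import numpy as np
from pyriemann.estimation import Covariances
from pyriemann.classification import MDM
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 32, 500))   # 120 trials, 32 channels, 500 samples
y = rng.integers(0, 2, size=120)          # 0 = imagined speech, 1 = intended speech

clf = make_pipeline(Covariances(estimator="oas"), MDM())
scores = cross_val_score(clf, X, y, cv=5)
print(f"cross-validated accuracy: {scores.mean():.3f}")
```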


Subject(s)
Brain-Computer Interfaces, Speech, Electrodes, Electroencephalography, Humans, Records
17.
J Med Internet Res ; 23(10): e25460, 2021 10 28.
Article in English | MEDLINE | ID: mdl-34709193

ABSTRACT

BACKGROUND: Cochlear implant technology is a well-known approach to help deaf individuals hear speech again and can improve speech intelligibility in quiet conditions; however, it still has room for improvement in noisy conditions. More recently, it has been proven that deep learning-based noise reduction, such as noise classification and deep denoising autoencoder (NC+DDAE), can benefit the intelligibility performance of patients with cochlear implants compared to classical noise reduction algorithms. OBJECTIVE: Following the successful implementation of the NC+DDAE model in our previous study, this study aimed to propose an advanced noise reduction system using knowledge transfer technology, called NC+DDAE_T; examine the proposed NC+DDAE_T noise reduction system using objective evaluations and subjective listening tests; and investigate which layer substitution of the knowledge transfer technology in the NC+DDAE_T noise reduction system provides the best outcome. METHODS: The knowledge transfer technology was adopted to reduce the number of parameters of the NC+DDAE_T compared with the NC+DDAE. We investigated which layer should be substituted using short-time objective intelligibility and perceptual evaluation of speech quality scores as well as t-distributed stochastic neighbor embedding to visualize the features in each model layer. Moreover, we enrolled 10 cochlear implant users for listening tests to evaluate the benefits of the newly developed NC+DDAE_T. RESULTS: The experimental results showed that substituting the middle layer (ie, the second layer in this study) of the noise-independent DDAE (NI-DDAE) model achieved the best performance gain regarding short-time objective intelligibility and perceptual evaluation of speech quality scores. Therefore, the parameters of layer 3 in the NI-DDAE were chosen to be replaced, thereby establishing the NC+DDAE_T. Both objective and listening test results showed that the proposed NC+DDAE_T noise reduction system achieved similar performances compared with the previous NC+DDAE in several noisy test conditions. However, the proposed NC+DDAE_T only required a quarter of the number of parameters compared to the NC+DDAE. CONCLUSIONS: This study demonstrated that knowledge transfer technology can help reduce the number of parameters in an NC+DDAE while keeping similar performance rates. This suggests that the proposed NC+DDAE_T model may reduce the implementation costs of this noise reduction system and provide more benefits for cochlear implant users.
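
One way to read the layer-substitution idea is as copying the parameters of a chosen pretrained DDAE layer into a smaller student network and freezing them, as sketched below in PyTorch. The layer sizes and the substituted layer index are assumptions, not the NC+DDAE_T configuration.

```python
# Hedged sketch: transferring the parameters of one pretrained DDAE layer
# into a smaller student network, in the spirit of the layer-substitution
# idea. Layer sizes and the substituted layer index are assumptions.
import torch
import torch.nn as nn

def ddae(sizes):
    layers = []
    for i in range(len(sizes) - 1):
        layers += [nn.Linear(sizes[i], sizes[i + 1]), nn.ReLU()]
    return nn.Sequential(*layers[:-1])               # no activation on the output layer

teacher = ddae([257, 512, 512, 512, 257])            # pretrained noise-independent DDAE (assumed sizes)
student = ddae([257, 128, 512, 512, 128, 257])       # smaller model; one hidden layer kept teacher-sized

# Substitute the student's 512x512 hidden layer with the teacher's pretrained
# parameters and freeze it, so only the remaining student layers are trained.
student[4].load_state_dict(teacher[2].state_dict())  # both are Linear(512, 512)
for p in student[4].parameters():
    p.requires_grad = False

trainable = sum(p.numel() for p in student.parameters() if p.requires_grad)
total = sum(p.numel() for p in student.parameters())
print(f"trainable params: {trainable} / {total}")
```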


Subject(s)
Cochlear Implantation, Cochlear Implants, Speech Perception, Humans, Noise, Speech Intelligibility
18.
Sensors (Basel) ; 22(1)2021 Dec 31.
Article in English | MEDLINE | ID: mdl-35009834

ABSTRACT

Human motion tracking is widely applied to rehabilitation tasks, and inertial measurement unit (IMU) sensors are a well-known approach for recording motion behavior. IMU sensors can provide accurate information regarding three-dimensional (3D) human motion. However, IMU sensors must be attached to the body, which can be inconvenient or uncomfortable for users. To alleviate this issue, visual-based tracking systems using two-dimensional (2D) RGB images have been studied extensively in recent years and proven to perform well for human motion tracking. However, the 2D image approach has its limitations: human motion consists of spatial changes, and 3D motion features predicted from 2D images are limited. In this study, we propose a deep learning (DL) human motion tracking technology using 3D image features with a deep bidirectional long short-term memory (DBLSTM) model. The experimental results show that, compared with the traditional 2D image system, the proposed system provides improved human motion tracking ability, with an RMSE in acceleration of less than 0.5 m/s² in the X, Y, and Z directions. These findings suggest that the proposed model is a viable approach for future human motion tracking applications.
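
The acceleration RMSE reported above can be computed by double-differentiating the tracked 3D positions and comparing them with a reference acceleration signal, as sketched below. The frame rate and the synthetic trajectories are placeholders for real, time-aligned recordings.

```python
# Hedged sketch: per-axis RMSE between accelerations derived from tracked 3D
# joint positions (double finite difference) and a reference IMU signal.
# The arrays below are placeholders for real, time-aligned recordings.
import numpy as np

fs = 60.0                                    # camera frame rate (Hz), assumed
t = np.arange(0, 10, 1 / fs)
positions = np.stack([np.sin(t), np.cos(t), 0.1 * t], axis=1)            # (frames, 3) predicted path
imu_acc = np.stack([-np.sin(t), -np.cos(t), np.zeros_like(t)], axis=1)   # reference acceleration

# Second derivative of position -> acceleration, per axis.
pred_acc = np.gradient(np.gradient(positions, 1 / fs, axis=0), 1 / fs, axis=0)

rmse = np.sqrt(np.mean((pred_acc - imu_acc) ** 2, axis=0))
print(f"RMSE (m/s^2): X={rmse[0]:.3f}, Y={rmse[1]:.3f}, Z={rmse[2]:.3f}")
```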


Subject(s)
Imaging, Three-Dimensional, Memory, Short-Term, Humans, Motion (Physics)
19.
J Chin Med Assoc ; 84(1): 101-107, 2021 Jan 01.
Article in English | MEDLINE | ID: mdl-33177402

ABSTRACT

BACKGROUND: Idiopathic sudden sensorineural hearing loss (ISSNHL) is an emergency disease, and its pathogenesis is still largely unknown, making it difficult to diagnose and to develop a therapeutic strategy. To predict treatment outcomes and further customize the treatment strategy, we used a physician decision support system that incorporates complex information from electronic health records. We first developed the infrastructure of the physician decision support system, including an integrated data warehouse and an automatic data de-identification workflow. METHODS: We next conducted a cohort study to evaluate the treatment outcomes of 757 ISSNHL patients using the modified Siegel's criteria. Complete recovery (<25 dB) as the hearing outcome for ISSNHL patients was compared based on pretreatment hearing grade and disease onset, adjusted for age and sex after treatment initiation. RESULTS: The results showed that complete hearing recovery, in consideration of age and sex, was associated with pretreatment hearing Grade 2 (26-45 dB) (adjusted odds ratio 12.3, 95% confidence interval [CI]: 4.8-31.3) and disease onset ≤7 days (adjusted odds ratio 13.9, 95% CI: 4.2-45.8), respectively. Hearing recovery outcomes among complete recovery and noncomplete recovery (>25 dB) subjects according to pretreatment hearing grade were 32.9% (Grade 2, 26-45 dB HL), 25.4% (Grade 3, 46-75 dB HL), 31.1% (Grade 4, 76-90 dB HL), and 4.5% (Grade 5, >90 dB HL) (p < 0.0001). Patients with pretreatment hearing Grade 2 who received treatment within ≤7 days of disease onset had the highest rate of complete recovery (32.9%, 23/70). CONCLUSION: In summary, using the physician decision support system, we successfully identified two predictors, pretreatment hearing Grade 2 (26-45 dB) and treatment within ≤7 days of disease onset, associated with the highest odds of achieving complete recovery (<25 dB) of hearing in patients with ISSNHL.
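
Adjusted odds ratios of the kind reported above can be obtained from a logistic regression that includes age and sex, for example with statsmodels as sketched below. The column names and data file are hypothetical.

```python
# Hedged sketch: logistic regression for complete hearing recovery, adjusted
# for age and sex, with adjusted odds ratios and 95% CIs. Column names are
# hypothetical and the data file is a placeholder.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("issnhl_cohort.csv")        # one row per patient
model = smf.logit(
    "complete_recovery ~ C(pretreatment_grade) + C(onset_within_7_days) + age + C(sex)",
    data=df,
).fit()

odds_ratios = pd.DataFrame({
    "adjusted OR": np.exp(model.params),
    "CI 2.5%": np.exp(model.conf_int()[0]),
    "CI 97.5%": np.exp(model.conf_int()[1]),
})
print(odds_ratios)
```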


Subject(s)
Decision Support Systems, Clinical, Hearing Loss, Sensorineural/therapy, Hearing Loss, Sudden/therapy, Adult, Aged, Aged, 80 and over, Female, Hearing, Hearing Loss, Sensorineural/diagnosis, Hearing Loss, Sensorineural/physiopathology, Hearing Loss, Sudden/diagnosis, Hearing Loss, Sudden/physiopathology, Humans, Male, Middle Aged, Young Adult
20.
JMIR Mhealth Uhealth ; 8(12): e16746, 2020 12 03.
Article in English | MEDLINE | ID: mdl-33270033

ABSTRACT

BACKGROUND: Voice disorders mainly result from chronic overuse or abuse, particularly in occupational voice users such as teachers. Previous studies proposed a contact microphone attached to the anterior neck for ambulatory voice monitoring; however, the inconvenience associated with taping and wiring, along with the lack of real-time processing, has limited its clinical application. OBJECTIVE: This study aims to (1) propose an automatic speech detection system using wireless microphones for real-time ambulatory voice monitoring, (2) examine the detection accuracy under controlled environment and noisy conditions, and (3) report the results of the phonation ratio in practical scenarios. METHODS: We designed an adaptive threshold function to detect the presence of speech based on the energy envelope. We invited 10 teachers to participate in this study and tested the performance of the proposed automatic speech detection system regarding detection accuracy and phonation ratio. Moreover, we investigated whether the unsupervised noise reduction algorithm (ie, log minimum mean square error) can overcome the influence of environmental noise in the proposed system. RESULTS: The proposed system exhibited an average accuracy of speech detection of 89.9%, ranging from 81.0% (67,357/83,157 frames) to 95.0% (199,201/209,685 frames). Subsequent analyses revealed a phonation ratio between 44.0% (33,019/75,044 frames) and 78.0% (68,785/88,186 frames) during teaching sessions of 40-60 minutes; the durations of most of the phonation segments were less than 10 seconds. The presence of background noise reduced the accuracy of the automatic speech detection system, and an adjuvant noise reduction function could effectively improve the accuracy, especially under stable noise conditions. CONCLUSIONS: This study demonstrated an average detection accuracy of 89.9% in the proposed automatic speech detection system with wireless microphones. The preliminary results for the phonation ratio were comparable to those of previous studies. Although the wireless microphones are susceptible to background noise, an additional noise reduction function can alleviate this limitation. These results indicate that the proposed system can be applied for ambulatory voice monitoring in occupational voice users.
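
A minimal sketch of energy-based speech detection with an adaptive threshold, and the phonation ratio derived from it, is given below. The threshold rule is a simplification of the adaptive function described above, and the recording name is a placeholder.

```python
# Hedged sketch: frame-energy speech detection with a simple adaptive
# threshold and the resulting phonation ratio. The threshold rule here is a
# simplification, not the paper's exact adaptive function.
import numpy as np
import librosa

y, sr = librosa.load("teaching_session.wav", sr=16000)   # placeholder recording

frame_len, hop = int(0.025 * sr), int(0.010 * sr)         # 25 ms frames, 10 ms hop
frames = librosa.util.frame(y, frame_length=frame_len, hop_length=hop)
energy_db = 10 * np.log10(np.mean(frames ** 2, axis=0) + 1e-12)

# Adaptive threshold: track the noise floor with a slow-rising minimum
# estimate and flag frames that exceed it by a fixed margin.
noise_floor = np.copy(energy_db)
for i in range(1, len(energy_db)):
    noise_floor[i] = min(noise_floor[i - 1] + 0.05, energy_db[i])  # slow rise, fast drop
speech = energy_db > (noise_floor + 10.0)                          # 10 dB margin, assumed

phonation_ratio = speech.mean()
print(f"phonation ratio: {phonation_ratio:.1%}")
```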


Subject(s)
Speech Acoustics, Voice Disorders, Algorithms, Humans, Phonation, Speech